AITopics | trec ikat 2023

Can We Use Large Language Models to Fill Relevance Judgment Holes?

Abbasiantaeb, Zahra, Meng, Chuan, Azzopardi, Leif, Aliannejadi, Mohammad

arXiv.org Artificial Intelligence, May-9-2024

Incomplete relevance judgments limit the re-usability of test collections. When new systems are compared against the previous systems used to build the pool of judged documents, they are often at a disadvantage due to the "holes" in the test collection (i.e., pockets of un-assessed documents returned by the new system). In this paper, we take initial steps towards extending existing test collections by employing Large Language Models (LLMs) to fill the holes, leveraging existing human judgments to ground the method. We explore this problem in the context of Conversational Search using TREC iKAT, where information needs are highly dynamic and the responses (and the results retrieved) are much more varied, leaving bigger holes. While previous work has shown that automatic judgments from LLMs result in highly correlated rankings, we find substantially lower correlations when human plus automatic judgments are used (regardless of the LLM, one/two/few-shot prompting, or fine-tuning). We further find that, depending on the LLM employed, new runs will be highly favored (or penalized), and this effect is magnified proportionally to the size of the holes. Instead, one should generate the LLM annotations on the whole document pool to achieve rankings that are more consistent with those from human-generated labels. Future work is required on prompt engineering and fine-tuning LLMs to reflect and represent the human annotations, in order to ground and align the models so that they are more fit for purpose.
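To make the hole-filling setup in this abstract concrete, here is a minimal sketch on toy data. It is not the authors' pipeline: the function names, the use of precision at k, and the tau-a variant of Kendall's tau are all assumptions chosen for brevity, and the LLM labels are hard-coded stand-ins rather than real model output.

```python
from itertools import combinations


def fill_holes(human_qrels, llm_labels, retrieved):
    """Keep the human label where one exists; otherwise fall back to the LLM label."""
    merged = {}
    for doc in retrieved:
        if doc in human_qrels:
            merged[doc] = human_qrels[doc]
        elif doc in llm_labels:
            merged[doc] = llm_labels[doc]
    return merged


def precision_at_k(ranking, qrels, k):
    """Fraction of the top-k documents labeled relevant; unjudged counts as non-relevant."""
    return sum(1 for doc in ranking[:k] if qrels.get(doc, 0) > 0) / k


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two system scorings (tied pairs add nothing)."""
    systems = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) / 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0


# Toy data (hypothetical): d4 and d5 were never pooled, so they are holes.
human = {"d1": 1, "d2": 0, "d3": 1}
llm = {"d4": 1, "d5": 0}  # stand-in for LLM-generated labels
runs = {
    "pooled_a": ["d1", "d3", "d2"],
    "pooled_b": ["d1", "d2", "d3"],
    "new_run": ["d4", "d5", "d1"],
}

retrieved = sorted({doc for ranking in runs.values() for doc in ranking})
merged = fill_holes(human, llm, retrieved)

human_scores = {name: precision_at_k(r, human, 2) for name, r in runs.items()}
filled_scores = {name: precision_at_k(r, merged, 2) for name, r in runs.items()}
tau = kendall_tau(human_scores, filled_scores)
print(human_scores, filled_scores, round(tau, 3))
```

In this toy case the new run's score rises once its holes are filled while the pooled runs are unchanged, which is the kind of shift in system ordering the abstract warns about. A real study would use the collection's official metrics and a library implementation of rank correlation.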

judgment, relevance judgment, trec ikat 2023, (13 more...)

arXiv.org Artificial Intelligence

2405.056

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > New York > New York County > New York City (0.05)
South America > Brazil > Bahia > Salvador (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)

TREC iKAT 2023: A Test Collection for Evaluating Conversational and Interactive Knowledge Assistants

Aliannejadi, Mohammad, Abbasiantaeb, Zahra, Chatterjee, Shubham, Dalton, Jeffery, Azzopardi, Leif

arXiv.org Artificial Intelligence, May-4-2024

Conversational information seeking has evolved rapidly in the last few years with the development of Large Language Models (LLMs), providing the basis for interpreting and responding in a naturalistic manner to user requests. The extended TREC Interactive Knowledge Assistance Track (iKAT) collection aims to enable researchers to test and evaluate their Conversational Search Agents (CSAs). The collection contains a set of 36 personalized dialogues over 20 different topics, each coupled with a Personal Text Knowledge Base (PTKB) that defines the bespoke user personas. A total of 344 turns with approximately 26,000 passages are provided as assessments on relevance, as well as additional assessments on generated responses over four key dimensions: relevance, completeness, groundedness, and naturalness. The collection challenges CSAs to efficiently navigate diverse personal contexts, elicit pertinent persona information, and employ context for relevant conversations. The integration of a PTKB and the emphasis on decisional search tasks contribute to the uniqueness of this test collection, making it an essential benchmark for advancing research in conversational and interactive knowledge assistants.
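The structure described in this abstract (dialogues tied to a PTKB, turns with passage-level relevance labels, and responses assessed on four dimensions) can be pictured with a small sketch. The class names, field names, and toy values below are hypothetical and do not reflect the actual iKAT file format; only the four dimension names come from the abstract.

```python
from dataclasses import dataclass, field
from statistics import mean

# The four response-assessment dimensions named in the abstract.
DIMENSIONS = ("relevance", "completeness", "groundedness", "naturalness")


@dataclass
class Turn:
    utterance: str
    passage_labels: dict = field(default_factory=dict)   # passage id -> relevance grade
    response_scores: dict = field(default_factory=dict)  # dimension -> assessed score


@dataclass
class Dialogue:
    topic: str
    ptkb: list  # persona statements for this user
    turns: list = field(default_factory=list)


def mean_response_scores(dialogue):
    """Average each dimension over the turns that were assessed on it."""
    averages = {}
    for dim in DIMENSIONS:
        values = [t.response_scores[dim] for t in dialogue.turns if dim in t.response_scores]
        if values:
            averages[dim] = mean(values)
    return averages


# Toy dialogue with made-up content and scores.
dialogue = Dialogue(
    topic="example topic",
    ptkb=["example persona statement"],
    turns=[
        Turn("first question", {"p1": 2, "p2": 0}, {"relevance": 2, "naturalness": 1}),
        Turn("follow-up question", {"p3": 1}, {"relevance": 1, "naturalness": 1}),
    ],
)
averages = mean_response_scores(dialogue)
print(averages)
```

Dimensions with no assessed turns are simply left out of the result, which avoids reporting a misleading zero for something that was never judged.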

assessment, information, trec ikat 2023, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3626772.3657860

2405.02637

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

TREC iKAT 2023: The Interactive Knowledge Assistance Track Overview

Aliannejadi, Mohammad, Abbasiantaeb, Zahra, Chatterjee, Shubham, Dalton, Jeffery, Azzopardi, Leif

arXiv.org Artificial Intelligence, Jan-2-2024

Conversational Information Seeking stands as a pivotal research area with significant contributions from previous works. The TREC Interactive Knowledge Assistance Track (iKAT) builds on the foundational work of the TREC Conversational Assistance Track (CAsT). However, iKAT distinctively emphasizes the creation and research of conversational search agents that adapt responses based on the user's prior interactions and present context. The challenge lies in enabling Conversational Search Agents (CSAs) to incorporate this personalized context to efficiently and effectively guide users through the information relevant to them. iKAT also emphasizes decisional search tasks, where users sift through data and information to weigh up options in order to reach a conclusion or perform an action. These tasks, prevalent in everyday information-seeking decisions -- be it related to travel, health, or shopping -- often revolve around a subset of high-level information operators, where queries or questions about the information space include finding options, comparing options, identifying the pros and cons of options, etc. Given the different personas and their information needs (expressed through the sequence of questions), diverse conversation trajectories will arise -- because the answers to these similar queries will be very different. In this paper, we report on the first year of TREC iKAT, describing the task, topics, data collection, and evaluation framework. We further review the submissions and summarize the findings.

information, ptkb statement, retrieval, (14 more...)

arXiv.org Artificial Intelligence

2401.0133

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.06)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
